Exploiting Strong Syntactic Heuristics and Co-Training to Learn Semantic Lexicons

Authors

William Phillips

Ellen Riloff

Abstract

We present a bootstrapping method that uses strong syntactic heuristics to learn semantic lexicons. The three sources of information are appositives, compound nouns, and ISA clauses. We apply heuristics to these syntactic structures, embed them in a bootstrapping architecture, and combine them with co-training. Results on WSJ articles and a pharmaceutical corpus show that this method obtains high precision and finds a large number of terms.
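The bootstrapping idea in the abstract can be illustrated with a minimal sketch. It assumes that the syntactic heuristics have already produced pairs of nouns linked by an appositive, a compound noun, or an ISA clause, and it grows a lexicon for one semantic category from seed words. The function name `bootstrap_lexicon`, the `linked_pairs` input, and the toy data are hypothetical; the paper's actual heuristics and its co-training step are not reproduced here.

```python
def bootstrap_lexicon(seeds, linked_pairs, max_iterations=10):
    """Grow a semantic lexicon from seed words (illustrative sketch only).

    linked_pairs: iterable of (noun, noun) tuples, assumed to come from
    appositives, compound nouns, or ISA clauses found by a parser.
    """
    lexicon = set(seeds)
    for _ in range(max_iterations):
        new_terms = set()
        for left, right in linked_pairs:
            # If one side is already a known category member,
            # hypothesize that the other side belongs too.
            if left in lexicon and right not in lexicon:
                new_terms.add(right)
            elif right in lexicon and left not in lexicon:
                new_terms.add(left)
        if not new_terms:
            break  # nothing new was learned, so stop bootstrapping
        lexicon |= new_terms
    return lexicon


# Toy pairs standing in for extracted syntactic relations (made-up data).
pairs = [
    ("aspirin", "analgesic"),
    ("analgesic", "ibuprofen"),
    ("market", "index"),
]
result = bootstrap_lexicon({"aspirin"}, pairs)
print(sorted(result))
```

In this toy run the unrelated pair ("market", "index") is never added because neither word is linked to a known member. A real system would need stricter acceptance criteria than this simple propagation to keep precision high.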

Similar resources

Semantic Role Labeling of Persian Sentences Using a Memory-Based Learning Approach

Extracting semantic roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a semantic role labeling system for Persian, using a memory-based learning model and standard features. Our proposed system implements a two-phase architecture to first identify...

Using Co-Composition For Acquiring Syntactic And Semantic Subcategorisation

Natural language parsing requires extensive lexicons containing subcategorisation information for specific sublanguages. This paper describes an unsupervised method for acquiring both syntactic and semantic subcategorisation restrictions from corpora. Special attention will be paid to the role of co-composition in the acquisition strategy. The acquired information is used for lexicon tuning and...

Semantic annotation of biosystematics literature without training examples

This article presents an unsupervised algorithm for semantic annotation of morphological descriptions of whole organisms. The algorithm is able to annotate plain text descriptions with high accuracy at the clause level by exploiting the corpus itself. In other words, the algorithm does not need lexicons, syntactic parsers, training examples, or annotation templates. The evaluation on two real-li...

Learning to Disambiguate Relative Pronouns

In this paper we show how a natural language system can learn to find the antecedents of relative pronouns. We use a well-known conceptual clustering system to create a case-based memory that predicts the antecedent of a wh-word given a description of the clause that precedes it. Our automated approach duplicates the performance of hand-coded rules. In addition, it requires only minimal syntact...
